Online ℓ1-Dictionary Learning with Application to Novel Document Detection

Authors

  • Shiva Prasad Kasiviswanathan
  • Huahua Wang
  • Arindam Banerjee
Abstract

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning in which, unlike traditional dictionary learning, which uses the squared loss, the ℓ1-penalty is used to measure the reconstruction error. We present an efficient online algorithm for this problem based on the alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results. Our algorithm for online ℓ1-dictionary learning could be of independent interest.
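To make the contrast between the two losses concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the ℓ1 reconstruction error is the entrywise sum of absolute residuals of P − AX; the function names and the toy matrices are hypothetical and chosen only for illustration.

```python
import numpy as np


def l1_reconstruction_error(P, A, X):
    """Entrywise l1 norm of the residual P - A X (assumed definition)."""
    return float(np.abs(P - A @ X).sum())


def squared_reconstruction_error(P, A, X):
    """Squared Frobenius norm of the residual P - A X."""
    return float(((P - A @ X) ** 2).sum())


# Toy data (hypothetical): a dictionary A and codes X that reconstruct P exactly.
A = np.eye(2)
X = np.array([[1.0, 0.0], [0.0, 1.0]])
P = A @ X

# Corrupt a single entry to mimic one outlying term count in one document.
P_outlier = P.copy()
P_outlier[0, 1] += 10.0

l1_err = l1_reconstruction_error(P_outlier, A, X)
sq_err = squared_reconstruction_error(P_outlier, A, X)
print(l1_err, sq_err)
```

In this toy case a single corrupted entry contributes linearly to the ℓ1 error but quadratically to the squared error, which is the usual intuition for why an ℓ1 penalty is less dominated by a few large residuals.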

Similar resources

Fast online L1-dictionary learning algorithms for novel document detection

Online L1-dictionary learning, introduced by Kasiviswanathan et al. [1], is the process of generating a sequence of (dictionary) matrices {A_{t+1}}, one at a time, for t = 0, 1, …. After committing to A_{t+1}, a pair of matrices (P_{t+1}, X_{t+1}) is revealed and the online algorithm incurs a cost of ‖P_{t+1} − A_{t+1}X_{t+1}‖_1. The goal of the online algorithm is to ensure that the total cost up to each time is...


A Novel Face Detection Method Based on Over-complete Incoherent Dictionary Learning

In this paper, the face detection problem is considered using concepts from compressive sensing. This technique includes a dictionary learning procedure and a sparse coding method to represent the structural content of input images. In the proposed method, dictionaries are learned in such a way that the trained models have the least degree of coherence with each other. The novelty of the prop...


Novel Document Detection using Online L1-Dictionary Learning

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a robust and scalable manner. In this paper, we introduce an approach for novel document detection based on online dictionary learning. Unlike traditional...


Appendix to “Online ℓ1-Dictionary Learning with Application to Novel Document Detection”

2 ‖z − Fx − Gy‖_2^2, where ρ ∈ R is the Lagrangian multiplier and φ > 0 is a penalty parameter. ADMM exploits the separable form of (1) and replaces the joint minimization over x and y with two simpler problems. ADMM first minimizes L over x, then over y, and then applies a proximal minimization step with respect to the Lagrange multiplier ρ. The entire ADMM procedure is summarized in Algori...


Publication date: 2012